Skip to content

Conversation

ecnelises
Copy link
Member

This actually reverts 4181205.

The original commit omits unit with all symbols inlined into current one, which leads to crash when a module using split-dwarf inlined a function from another module with mismatched split-dwarf-inlining option. This revert guarantees that DIEs are created in both DWO and the skeleton sections whenever split-dwarf is active.

@llvmbot
Copy link
Member

llvmbot commented Aug 14, 2025

@llvm/pr-subscribers-debuginfo

Author: Qiu Chaofan (ecnelises)

Changes

This actually reverts 4181205.

The original commit omits unit with all symbols inlined into current one, which leads to crash when a module using split-dwarf inlined a function from another module with mismatched split-dwarf-inlining option. This revert guarantees that DIEs are created in both DWO and the skeleton sections whenever split-dwarf is active.


Full diff: https://github.com/llvm/llvm-project/pull/153568.diff

2 Files Affected:

  • (modified) llvm/lib/CodeGen/AsmPrinter/DwarfDebug.cpp (+8-12)
  • (modified) llvm/test/DebugInfo/X86/split-dwarf-omit-empty.ll (+1-1)
diff --git a/llvm/lib/CodeGen/AsmPrinter/DwarfDebug.cpp b/llvm/lib/CodeGen/AsmPrinter/DwarfDebug.cpp
index c27f100775625..afa8511981e23 100644
--- a/llvm/lib/CodeGen/AsmPrinter/DwarfDebug.cpp
+++ b/llvm/lib/CodeGen/AsmPrinter/DwarfDebug.cpp
@@ -558,18 +558,14 @@ void DwarfDebug::constructAbstractSubprogramScopeDIE(DwarfCompileUnit &SrcCU,
 
   // Find the subprogram's DwarfCompileUnit in the SPMap in case the subprogram
   // was inlined from another compile unit.
-  if (useSplitDwarf() && !shareAcrossDWOCUs() && !SP->getUnit()->getSplitDebugInlining())
-    // Avoid building the original CU if it won't be used
-    SrcCU.constructAbstractSubprogramScopeDIE(Scope);
-  else {
-    auto &CU = getOrCreateDwarfCompileUnit(SP->getUnit());
-    if (auto *SkelCU = CU.getSkeleton()) {
-      (shareAcrossDWOCUs() ? CU : SrcCU)
-          .constructAbstractSubprogramScopeDIE(Scope);
-      if (CU.getCUNode()->getSplitDebugInlining())
-        SkelCU->constructAbstractSubprogramScopeDIE(Scope);
-    } else
-      CU.constructAbstractSubprogramScopeDIE(Scope);
+  auto &CU = getOrCreateDwarfCompileUnit(SP->getUnit());
+  if (auto *SkelCU = CU.getSkeleton()) {
+    (shareAcrossDWOCUs() ? CU : SrcCU)
+        .constructAbstractSubprogramScopeDIE(Scope);
+    if (CU.getCUNode()->getSplitDebugInlining())
+      SkelCU->constructAbstractSubprogramScopeDIE(Scope);
+  } else {
+    CU.constructAbstractSubprogramScopeDIE(Scope);
   }
 }
 
diff --git a/llvm/test/DebugInfo/X86/split-dwarf-omit-empty.ll b/llvm/test/DebugInfo/X86/split-dwarf-omit-empty.ll
index f4cee1ec78b1d..95aae44b679d5 100644
--- a/llvm/test/DebugInfo/X86/split-dwarf-omit-empty.ll
+++ b/llvm/test/DebugInfo/X86/split-dwarf-omit-empty.ll
@@ -38,7 +38,7 @@ entry:
 !0 = distinct !DICompileUnit(language: DW_LANG_C_plus_plus, file: !1, producer: "clang version 5.0.0 (trunk 304054) (llvm/trunk 304080)", isOptimized: false, runtimeVersion: 0, emissionKind: FullDebug, enums: !2, splitDebugInlining: false)
 !1 = !DIFile(filename: "a.cpp", directory: "/usr/local/google/home/blaikie/dev/scratch")
 !2 = !{}
-!3 = distinct !DICompileUnit(language: DW_LANG_C_plus_plus, file: !4, producer: "clang version 5.0.0 (trunk 304054) (llvm/trunk 304080)", isOptimized: false, runtimeVersion: 0, emissionKind: FullDebug, enums: !2, splitDebugInlining: false)
+!3 = distinct !DICompileUnit(language: DW_LANG_C_plus_plus, file: !4, producer: "clang version 5.0.0 (trunk 304054) (llvm/trunk 304080)", isOptimized: false, runtimeVersion: 0, emissionKind: FullDebug, enums: !2, splitDebugInlining: true)
 !4 = !DIFile(filename: "b.cpp", directory: "/usr/local/google/home/blaikie/dev/scratch")
 !5 = !{!"clang version 5.0.0 (trunk 304054) (llvm/trunk 304080)"}
 !6 = !{i32 2, !"Dwarf Version", i32 4}

@dwblaikie
Copy link
Collaborator

Hrm - could you describe this a bit more verbosely/with an example perhaps? I can't quite picture it straight-off.

Example of a case where the previous optimization continues to apply - and an example where this bug shows up.

Are there any false positives/false negatives? (do we end up producing empty CUs under certain circumstances with this new patch applied?)

@Sockke Sockke self-requested a review August 18, 2025 10:27
This actually reverts 4181205.

The original commit omits unit with all symbols inlined into current
one, which leads to crash when a module using split-dwarf inlined a
function from another module with mismatched split-dwarf-inlining
option. This revert guarantees that DIEs are created in both DWO and
the skeleton sections whenever split-dwarf is active.
@ecnelises
Copy link
Member Author

hi @dwblaikie

This comes from a real-world thin-lto case which triggers a crash: A static library containing bitcode files were compiled with debug info but split-inlining is false. The library is linked into final application builtin with -gsplit-dwarf (split-inlining=true).

In such case, when a function from the bitcode archive is inlined, compiler sees splitDebugInlining is false in the original CU, so incorrectly skips emitting an abstract origin DIE for that function into the skeleton. However, the application's CU requires the skeleton DIE to generate its own entry for inlined_subroutine, then crash happens. My added testcase shows the crash:

Assertion failed: (OriginDIE && "Unable to find original DIE for an inlined subprogram."), function constructInlinedScopeDIE, file DwarfCompileUnit.cpp, line 741.
PLEASE submit a bug report to https://github.com/llvm/llvm-project/issues/ and include the crash backtrace.
Stack dump:
1.	Running pass 'Function Pass Manager' on module 'llvm/test/DebugInfo/X86/split-dwarf-inline.ll'.
2.	Running pass 'X86 Assembly Printer' on function '@caller_func'
Stack dump without symbol names (ensure you have llvm-symbolizer in your PATH or set the environment var `LLVM_SYMBOLIZER_PATH` to point to it):
0  llc                      0x0000000106252104 llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) + 56
1  llc                      0x000000010624fe90 llvm::sys::RunSignalHandlers() + 172
2  llc                      0x0000000106252bfc SignalHandler(int, __siginfo*, void*) + 360
3  libsystem_platform.dylib 0x000000019bdf16a4 _sigtramp + 56
4  libsystem_pthread.dylib  0x000000019bdb788c pthread_kill + 296
5  libsystem_c.dylib        0x000000019bcc0a3c abort + 124
6  libsystem_c.dylib        0x000000019bcbfc70 err + 0
7  llc                      0x0000000106fdd0c8 llvm::DwarfCompileUnit::constructInlinedScopeDIE(llvm::LexicalScope*, llvm::DIE&) (.cold.9) + 0
8  llc                      0x0000000105404234 llvm::DwarfCompileUnit::constructInlinedScopeDIE(llvm::LexicalScope*, llvm::DIE&) + 704
9  llc                      0x0000000105403ee8 llvm::DwarfCompileUnit::constructScopeDIE(llvm::LexicalScope*, llvm::DIE&) + 88
10 llc                      0x0000000105405174 llvm::DwarfCompileUnit::createAndAddScopeChildren(llvm::LexicalScope*, llvm::DIE&) + 3868
11 llc                      0x0000000105407314 llvm::DwarfCompileUnit::constructSubprogramScopeDIE(llvm::DISubprogram const*, llvm::LexicalScope*, llvm::MCSymbol*) + 80
12 llc                      0x0000000105419f70 llvm::DwarfDebug::endFunctionImpl(llvm::MachineFunction const*) + 1672
13 llc                      0x00000001053f942c llvm::DebugHandlerBase::endFunction(llvm::MachineFunction const*) + 64
14 llc                      0x00000001053dea40 llvm::AsmPrinter::emitFunctionBody() + 14580
15 llc                      0x0000000104d2efec llvm::X86AsmPrinter::runOnMachineFunction(llvm::MachineFunction&) + 396
16 llc                      0x000000010563a714 llvm::MachineFunctionPass::runOnFunction(llvm::Function&) + 784
17 llc                      0x0000000105a8f338 llvm::FPPassManager::runOnFunction(llvm::Function&) + 656
18 llc                      0x0000000105a9649c llvm::FPPassManager::runOnModule(llvm::Module&) + 60
19 llc                      0x0000000105a8fbec llvm::legacy::PassManagerImpl::run(llvm::Module&) + 1528

@ecnelises
Copy link
Member Author

Gentle ping

@dwblaikie dwblaikie removed the request for review from joker-eph September 16, 2025 20:32
@dwblaikie
Copy link
Collaborator

So, if I recall correctly, the original patch (just linking it here for my own benefit, passing through the recommit linked above: f2f898a ) addressed a real scalability problem with DWARF and ThinLTO.

Specifically, using split-dwarf-inlining with ThinLTO meant creating a lot of tiny CUs for all the imported CUs in each backend compile. This massively increases the number of CUs and causes storage and debugger scalability issues.

So, I guess a few questions, but maybe the simplest to start with: if this reverts the original patch, how does it not regress the original test case: llvm/test/DebugInfo/X86/split-dwarf-omit-empty.ll ? Is the functionality preserved through some other means now?

If the originally tested/desired behavior is preserved otherwise, that seems fine?

Might be worth testing the original revision and one before it - perhaps I made a mistake in the test case, and it doesn't actually represent behavior fixed by the original patch? (though I'm pretty sure there was some functional change that was important in that patch - so if the test doesn't cover that functionality, it probably suggests we need a better test - not that the original patch is fine to revert (& that a better test might demonstrate problems with this new patch, perhaps, that we need to figure out how to account for))

@Sockke
Copy link
Contributor

Sockke commented Sep 17, 2025

Specifically, using split-dwarf-inlining with ThinLTO meant creating a lot of tiny CUs for all the imported CUs in each backend compile. This massively increases the number of CUs and causes storage and debugger scalability issues.

I see that e731a26 seems to have handled this to avoid creating a lot of tiny CUs for all the imported CUs in each backend compile in this scenario.

  if (useSplitDwarf() &&
      !shareAcrossDWOCUs() &&
      (!DIUnit->getSplitDebugInlining() ||
       DIUnit->getEmissionKind() == DICompileUnit::FullDebug) &&
      !CUMap.empty()) {
    return *CUMap.begin()->second;
  }

Please correct me if I'm mistaken.

So, I guess a few questions, but maybe the simplest to start with: if this reverts the original patch, how does it not regress the original test case: llvm/test/DebugInfo/X86/split-dwarf-omit-empty.ll ? Is the functionality preserved through some other means now?

To be precise, this change isn't a simple revert. It enables correct handling of the case where a function from a CU with splitDebugInlining=false is inlined into a CU with splitDebugInlining=true.

@dwblaikie
Copy link
Collaborator

Specifically, using split-dwarf-inlining with ThinLTO meant creating a lot of tiny CUs for all the imported CUs in each backend compile. This massively increases the number of CUs and causes storage and debugger scalability issues.

I see that e731a26 seems to have handled this to avoid creating a lot of tiny CUs for all the imported CUs in each backend compile in this scenario.

  if (useSplitDwarf() &&
      !shareAcrossDWOCUs() &&
      (!DIUnit->getSplitDebugInlining() ||
       DIUnit->getEmissionKind() == DICompileUnit::FullDebug) &&
      !CUMap.empty()) {
    return *CUMap.begin()->second;
  }

Please correct me if I'm mistaken.

That sounds plausible/vaguely like my recollection from way-back when. Thanks for doing the archaeology here.

So, I guess a few questions, but maybe the simplest to start with: if this reverts the original patch, how does it not regress the original test case: llvm/test/DebugInfo/X86/split-dwarf-omit-empty.ll ? Is the functionality preserved through some other means now?

To be precise, this change isn't a simple revert. It enables correct handling of the case where a function from a CU with splitDebugInlining=true is inlined into a CU with splitDebugInlining=false.

Not sure I follow - looking at the text of the change it does look like a revert. Is there some difference in the code this patch is adding compared to the code that was removed back in the original?

It's probably fine that it's a straight revert - given the functionality of the original patch was fixed more broadly with the other patch you mentioned.

@Sockke
Copy link
Contributor

Sockke commented Sep 18, 2025

Not sure I follow - looking at the text of the change it does look like a revert. Is there some difference in the code this patch is adding compared to the code that was removed back in the original?

  if (useSplitDwarf() && !shareAcrossDWOCUs() && !SP->getUnit()->getSplitDebugInlining())
    // Avoid building the original CU if it won't be used
    SrcCU.constructAbstractSubprogramScopeDIE(Scope);

it will only generate an abstract subprogram in the current DWO for the inlined function of the CU without -fsplit-dwarf-inlining, but not in the skeleton.

  if (auto *SkelCU = TheCU.getSkeleton())
    if (!LScopes.getAbstractScopesList().empty() &&
        TheCU.getCUNode()->getSplitDebugInlining())
      SkelCU->constructSubprogramScopeDIE(SP, FnScope, FunctionLineTableLabel);

This will trigger an assertion error when creating a child DIE for the inlined function in the skeleton by the CU with -fsplit-dwarf-inlining, indicating that the parent abstract subprogram in the skeleton is null.

 /*
  if (useSplitDwarf() && !shareAcrossDWOCUs() && !SP->getUnit()->getSplitDebugInlining())
    // Avoid building the original CU if it won't be used
    SrcCU.constructAbstractSubprogramScopeDIE(Scope);
 */

Remove this if-statement logic to ensure that the abstract subprogram is properly generated in the skeleton unit, while e731a26 ensures that redundant tiny CUs will not be generated.

I think the logic is roughly like that.

@ecnelises
Copy link
Member Author

I see that e731a26 seems to have handled this to avoid creating a lot of tiny CUs for all the imported CUs in each backend compile in this scenario.

Thanks! Yes, e731a26 already addressed the issue of split-dwarf-omit-empty.ll added by 4181205.

Not sure I follow - looking at the text of the change it does look like a revert. Is there some difference in the code this patch is adding compared to the code that was removed back in the original?

Technically this is a revert. I did not try an old enough LLVM to exactly contain the commit and test whether it has the ICE at that time. Anyway since the original issue is resolved by a later commit (e731a26), and reverting this fixes another ICE, this can also be regarded as a fix, I think which is what @Sockke means. :-)

Copy link
Collaborator

@dwblaikie dwblaikie left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good - thanks for walking me through it, doing the archaeology, etc.

@dwblaikie dwblaikie merged commit e8311f8 into llvm:main Sep 18, 2025
9 checks passed
@ecnelises ecnelises deleted the dwarf_split_gen_fix branch September 19, 2025 02:40
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants